23 research outputs found

    Privacy-Aware MMSE Estimation

    We investigate the problem of the predictability of random variable $Y$ under a privacy constraint dictated by random variable $X$, correlated with $Y$, where both predictability and privacy are assessed in terms of the minimum mean-squared error (MMSE). Given that $X$ and $Y$ are connected via a binary-input symmetric-output (BISO) channel, we derive the \emph{optimal} random mapping $P_{Z|Y}$ such that the MMSE of $Y$ given $Z$ is minimized while the MMSE of $X$ given $Z$ is greater than $(1-\epsilon)\mathsf{var}(X)$ for a given $\epsilon\geq 0$. We also consider the case where $(X,Y)$ are continuous and $P_{Z|Y}$ is restricted to be an additive noise channel. Comment: 9 pages, 3 figure
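
The two MMSE quantities in this abstract can be evaluated numerically for a toy binary case. The following is a minimal sketch under stated assumptions: $X$ is uniform on {0, 1}, the BISO channel from $X$ to $Y$ is taken to be a binary symmetric channel with crossover 0.1, and the mapping $P_{Z|Y}$ is a hypothetical binary symmetric channel with crossover `delta`. It is not the optimal mapping derived in the paper, and all names and parameter values are illustrative.

```python
def mmse(joint):
    """MMSE of U given Z, where joint[u][z] = P(U=u, Z=z) and U takes values 0, 1, ..."""
    n_u, n_z = len(joint), len(joint[0])
    second_moment = sum(u * u * joint[u][z] for u in range(n_u) for z in range(n_z))
    explained = 0.0
    for z in range(n_z):
        p_z = sum(joint[u][z] for u in range(n_u))
        if p_z > 0:
            cond_mean = sum(u * joint[u][z] for u in range(n_u)) / p_z
            explained += p_z * cond_mean ** 2
    # E[U^2] - E[ E[U|Z]^2 ] is the expected conditional variance.
    return second_moment - explained

def bsc(crossover):
    """Binary symmetric channel as a row-stochastic matrix k[input][output]."""
    return [[1.0 - crossover, crossover], [crossover, 1.0 - crossover]]

# Assumed toy model (not from the paper): X uniform, Y = X through BSC(0.1).
alpha, delta, eps = 0.1, 0.3, 0.2
p_x = [0.5, 0.5]
x_to_y = bsc(alpha)
joint_xy = [[p_x[x] * x_to_y[x][y] for y in range(2)] for x in range(2)]
p_y = [sum(joint_xy[x][y] for x in range(2)) for y in range(2)]

# Hypothetical privacy mechanism P_{Z|Y}: another BSC with crossover delta.
y_to_z = bsc(delta)
joint_xz = [[sum(joint_xy[x][y] * y_to_z[y][z] for y in range(2)) for z in range(2)]
            for x in range(2)]
joint_yz = [[p_y[y] * y_to_z[y][z] for z in range(2)] for y in range(2)]

var_x = p_x[1] * (1.0 - p_x[1])
mmse_x_given_z = mmse(joint_xz)   # privacy side: should stay large
mmse_y_given_z = mmse(joint_yz)   # utility side: should be small
privacy_ok = mmse_x_given_z >= (1.0 - eps) * var_x
print(mmse_x_given_z, mmse_y_given_z, privacy_ok)
```

Sweeping `delta` over [0, 0.5] traces how the utility term grows as the privacy constraint is tightened for this particular mechanism family.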

    Notes on Information-Theoretic Privacy

    We investigate the tradeoff between privacy and utility in a situation where both privacy and utility are measured in terms of mutual information. For the binary case, we fully characterize this tradeoff in case of perfect privacy and also give an upper bound for the case where some privacy leakage is allowed. We then introduce a new quantity which quantifies the amount of private information contained in the observable data and then connect it to the optimal tradeoff between privacy and utility. Comment: The corrected version of a paper appeared in Allerton 201

    Contraction of Locally Differentially Private Mechanisms

    We investigate the contraction properties of locally differentially private mechanisms. More specifically, we derive tight upper bounds on the divergence between the output distributions $P\mathsf{K}$ and $Q\mathsf{K}$ of an $\varepsilon$-LDP mechanism $\mathsf{K}$ in terms of a divergence between the corresponding input distributions $P$ and $Q$, respectively. Our first main technical result presents a sharp upper bound on the $\chi^2$-divergence $\chi^2(P\mathsf{K}\|Q\mathsf{K})$ in terms of $\chi^2(P\|Q)$ and $\varepsilon$. We also show that the same result holds for a large family of divergences, including KL-divergence and squared Hellinger distance. The second main technical result gives an upper bound on $\chi^2(P\mathsf{K}\|Q\mathsf{K})$ in terms of the total variation distance $\mathsf{TV}(P, Q)$ and $\varepsilon$. We then utilize these bounds to establish locally private versions of the van Trees inequality, Le Cam's, Assouad's, and the mutual information methods, which are powerful tools for bounding minimax estimation risks. These results are shown to lead to better privacy analyses than the state of the art in several statistical problems such as entropy and discrete distribution estimation, non-parametric density estimation, and hypothesis testing.
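
As a rough numerical illustration of the contraction being studied, the sketch below pushes two input distributions through binary randomized response, a standard $\varepsilon$-LDP mechanism chosen here as an assumption, and compares the $\chi^2$-divergence before and after. It only checks the generic data processing inequality (the ratio is at most 1); the sharp $\varepsilon$-dependent bound from the paper is not stated in the abstract and is not reproduced. The helper names and the values of `eps`, `P`, and `Q` are hypothetical.

```python
import math

def randomized_response(eps):
    """Binary randomized response: keep the true bit with probability e^eps / (1 + e^eps)."""
    keep = math.exp(eps) / (1.0 + math.exp(eps))
    return [[keep, 1.0 - keep], [1.0 - keep, keep]]  # kernel[x][z] = K(z | x)

def push_forward(p, kernel):
    """Output distribution PK when the input is distributed according to p."""
    n_out = len(kernel[0])
    return [sum(p[x] * kernel[x][z] for x in range(len(p))) for z in range(n_out)]

def chi2(p, q):
    """Chi-squared divergence chi^2(p || q) for pmfs with q > 0 everywhere."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))

def worst_case_ratio(kernel):
    """max over z, x, x' of K(z|x) / K(z|x'); eps-LDP requires this to be at most e^eps."""
    n_in, n_out = len(kernel), len(kernel[0])
    return max(kernel[x][z] / kernel[xp][z]
               for z in range(n_out) for x in range(n_in) for xp in range(n_in))

eps = 1.0
K = randomized_response(eps)
P, Q = [0.8, 0.2], [0.4, 0.6]

chi2_in = chi2(P, Q)
chi2_out = chi2(push_forward(P, K), push_forward(Q, K))
contraction = chi2_out / chi2_in  # ratio of output to input divergence
print(worst_case_ratio(K), math.exp(eps), contraction)
```

Lowering `eps` makes the mechanism noisier, so the printed ratio shrinks toward 0 as `eps` approaches 0.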

    Bottleneck Problems: Information and Estimation-Theoretic View

    Information bottleneck (IB) and privacy funnel (PF) are two closely related optimization problems which have found applications in machine learning, design of privacy algorithms, capacity problems (e.g., Mrs. Gerber's Lemma), and strong data processing inequalities, among others. In this work, we first investigate the functional properties of IB and PF through a unified theoretical framework. We then connect them to three information-theoretic coding problems, namely hypothesis testing against independence, noisy source coding, and dependence dilution. Leveraging these connections, we prove a new cardinality bound for the auxiliary variable in IB, making its computation more tractable for discrete random variables. In the second part, we introduce a general family of optimization problems, termed \textit{bottleneck problems}, by replacing mutual information in IB and PF with other notions of mutual information, namely $f$-information and Arimoto's mutual information. We then argue that, unlike IB and PF, these problems lead to easily interpretable guarantees in a variety of inference tasks with statistical constraints on accuracy and privacy. Although the underlying optimization problems are non-convex, we develop a technique to evaluate bottleneck problems in closed form by equivalently expressing them in terms of the lower convex or upper concave envelope of certain functions. By applying this technique to the binary case, we derive closed form expressions for several bottleneck problems.

    Privacy-Aware Guessing Efficiency

    We investigate the problem of guessing a discrete random variable $Y$ under a privacy constraint dictated by another correlated discrete random variable $X$, where both guessing efficiency and privacy are assessed in terms of the probability of correct guessing. We define $h(P_{XY}, \epsilon)$ as the maximum probability of correctly guessing $Y$ given an auxiliary random variable $Z$, where the maximization is taken over all $P_{Z|Y}$ ensuring that the probability of correctly guessing $X$ given $Z$ does not exceed $\epsilon$. We show that the map $\epsilon\mapsto h(P_{XY}, \epsilon)$ is strictly increasing, concave, and piecewise linear, which allows us to derive a closed form expression for $h(P_{XY}, \epsilon)$ when $X$ and $Y$ are connected via a binary-input binary-output channel. For $(X^n, Y^n)$ being pairs of independent and identically distributed binary random vectors, we similarly define $\underline{h}_n(P_{X^nY^n}, \epsilon)$ under the assumption that $Z^n$ is also a binary vector. Then we obtain a closed form expression for $\underline{h}_n(P_{X^nY^n}, \epsilon)$ for sufficiently large, but nontrivial values of $\epsilon$. Comment: ISIT 201
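
The definition of $h(P_{XY}, \epsilon)$ can be explored with a brute-force search. The sketch below is a hedged illustration, not the closed form from the paper: it restricts $Z$ to be binary and scans a coarse grid of mappings $P_{Z|Y}$, so it returns only a lower bound on $h$. The joint distribution `p_xy`, the grid resolution, and the function names are all assumptions made for the example.

```python
def correct_guess_prob(joint):
    """P_c(U|Z) = sum_z max_u P(U=u, Z=z), for joint[u][z]."""
    return sum(max(joint[u][z] for u in range(len(joint)))
               for z in range(len(joint[0])))

def h_lower_bound(p_xy, eps, steps=20, tol=1e-9):
    """Grid-search lower bound on h(P_XY, eps) over binary-output mappings P_{Z|Y}.

    p_xy[x][y] is the joint pmf of binary X and Y. Returns None when no grid
    point satisfies the privacy constraint P_c(X|Z) <= eps.
    """
    p_y = [sum(p_xy[x][y] for x in range(2)) for y in range(2)]
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1):
            a, b = i / steps, j / steps          # a = P(Z=0|Y=0), b = P(Z=0|Y=1)
            k = [[a, 1.0 - a], [b, 1.0 - b]]     # k[y][z] = P(Z=z | Y=y)
            # X -> Y -> Z is a Markov chain, so P(x, z) = sum_y P(x, y) P(z | y).
            joint_xz = [[sum(p_xy[x][y] * k[y][z] for y in range(2)) for z in range(2)]
                        for x in range(2)]
            joint_yz = [[p_y[y] * k[y][z] for z in range(2)] for y in range(2)]
            if correct_guess_prob(joint_xz) <= eps + tol:
                value = correct_guess_prob(joint_yz)
                if best is None or value > best:
                    best = value
    return best

# Assumed example: X uniform, Y equal to X with probability 0.8.
p_xy = [[0.4, 0.1], [0.1, 0.4]]
h_050 = h_lower_bound(p_xy, 0.50)   # no leakage beyond blind guessing of X
h_065 = h_lower_bound(p_xy, 0.65)
h_080 = h_lower_bound(p_xy, 0.80)   # releasing Y itself becomes feasible
print(h_050, h_065, h_080)
```

Evaluating the function on a finer grid of `eps` values gives a quick visual check of the monotone, piecewise linear shape described in the abstract for this toy distribution.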

    Information Extraction Under Privacy Constraints

    A privacy-constrained information extraction problem is considered where for a pair of correlated discrete random variables $(X,Y)$ governed by a given joint distribution, an agent observes $Y$ and wants to convey to a potentially public user as much information about $Y$ as possible without compromising the amount of information revealed about $X$. To this end, the so-called {\em rate-privacy function} is introduced to quantify the maximal amount of information (measured in terms of mutual information) that can be extracted from $Y$ under a privacy constraint between $X$ and the extracted information, where privacy is measured using either mutual information or maximal correlation. Properties of the rate-privacy function are analyzed and information-theoretic and estimation-theoretic interpretations of it are presented for both the mutual information and maximal correlation privacy measures. It is also shown that the rate-privacy function admits a closed-form expression for a large family of joint distributions of $(X,Y)$. Finally, the rate-privacy function under the mutual information privacy measure is considered for the case where $(X,Y)$ has a joint probability density function by studying the problem where the extracted information is a uniform quantization of $Y$ corrupted by additive Gaussian noise. The asymptotic behavior of the rate-privacy function is studied as the quantization resolution grows without bound and it is observed that not all of the properties of the rate-privacy function carry over from the discrete to the continuous case. Comment: 55 pages, 6 figures. Improved the organization and added detailed literature revie